03. Catch Game Sample

Catch

The [catch]() sample is a simple C/C++ program that links against the reinforcement learning library provided in the repository. The environment is a 2-dimensional screen: a ball drops from the top of the screen, and the agent is supposed to “catch” the ball before it hits the bottom of the screen. Its only allowed actions are left, right, or none.

catch Implementation

The main procedure in catch.cpp consists of the following sections (a simplified sketch of the game loop follows the list):

  • Initialize and instantiate a dqnAgent
  • Allocate memory for the game
  • Set up the game state (ball location)
  • Game loop
    • Update the game state
    • Get the agent’s next action with NextAction()
    • Apply the action to the game
    • Compute the reward
    • Exit if game over
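
To make that structure concrete, here is a minimal sketch of such a game loop. The agent->NextAction() and agent->NextReward() calls come from the repository's dqnAgent API; the other helpers (updateGame(), applyAction(), computeReward(), isGameOver()) and the gameInput tensor are hypothetical placeholders used only for illustration, not the exact code from catch.cpp:

    // Simplified sketch of the game loop (helper functions are illustrative placeholders)
    bool gameOver = false;

    while( !gameOver )
    {
        // advance the game state (the ball drops one row) and refresh the input tensor
        updateGame(gameInput);

        // ask the agent for its next action, given the current screen
        int action = 0;
        agent->NextAction(gameInput, &action);

        // apply the chosen action (move the paddle left, right, or not at all)
        applyAction(action);

        // compute the reward and check whether the episode has ended
        const float reward = computeReward();
        gameOver = isGameOver();

        // report the reward back to the agent so the DQN can learn from it
        agent->NextReward(reward, gameOver);
    }

The NextAction()/NextReward() pair is the core of the API: the agent observes the state, chooses an action, and is then told how well that action turned out.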

The dqnAgent is initialized with parameters defined at the start of the module:

    // Create reinforcement learner agent in pyTorch using API
    dqnAgent* agent = dqnAgent::Create(gameWidth, gameHeight, 
                       NUM_CHANNELS, NUM_ACTIONS, OPTIMIZER, 
                       LEARNING_RATE, REPLAY_MEMORY, BATCH_SIZE, 
                       GAMMA, EPS_START, EPS_END, EPS_DECAY,
                       USE_LSTM, LSTM_SIZE, ALLOW_RANDOM, DEBUG_DQN);
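
For reference, those parameters are typically preprocessor definitions near the top of catch.cpp, with gameWidth and gameHeight set from the command line (defaulting to the 64x64 input shown in the sample output below). The names below mirror the Create() arguments, but the values are illustrative placeholders rather than the repository's exact settings:

    // Hyperparameter definitions (illustrative values; see catch.cpp for the actual settings)
    #define NUM_CHANNELS   1          // grayscale screen
    #define NUM_ACTIONS    3          // left, right, none
    #define OPTIMIZER      "RMSprop"
    #define LEARNING_RATE  0.01f
    #define REPLAY_MEMORY  10000
    #define BATCH_SIZE     32
    #define GAMMA          0.9f       // reward discount factor
    #define EPS_START      0.9f       // initial exploration rate
    #define EPS_END        0.05f      // final exploration rate
    #define EPS_DECAY      200        // exploration decay schedule
    #define USE_LSTM       true       // feed frames through an LSTM layer
    #define LSTM_SIZE      256
    #define ALLOW_RANDOM   true       // allow random exploratory actions
    #define DEBUG_DQN      false      // verbose DQN debug output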

As with the OpenAI Gym environments, the catch game environment must provide rewards to the agent based on the action the agent chooses. The reward function snippet from catch.cpp can be found in the main game loop. The variable currDist is the agent's current distance to the ball, and prevDist is the distance to the ball in the previous frame. Review the snippet to answer the quiz below:

    // Compute reward
    float reward = 0.0f;

    if( currDist == 0 )
        reward = 1.0f;
    else if( currDist > prevDist )
        reward = -1.0f;
    else if( currDist < prevDist )
        reward = 1.0f;
    else if( currDist == prevDist )
        reward = 0.0f;
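
For a self-contained view of the same shaping scheme, the sketch below derives the reward from the horizontal distance between the paddle and the ball. The distance helper and function names are assumptions made for illustration; catch.cpp may compute its distances differently.

    #include <cstdlib>   // abs()

    // Horizontal distance between the paddle and the ball (illustrative helper)
    static int ballDistance( int paddleX, int ballX )
    {
        return abs(paddleX - ballX);
    }

    // Reward shaping used by the catch environment:
    //   +1 when the ball is caught or getting closer,
    //   -1 when it is getting farther away,
    //    0 when the distance is unchanged.
    static float rewardFromDistance( int currDist, int prevDist )
    {
        if( currDist == 0 )
            return 1.0f;        // caught the ball
        else if( currDist > prevDist )
            return -1.0f;       // moving away from the ball
        else if( currDist < prevDist )
            return 1.0f;        // moving toward the ball

        return 0.0f;            // no change in distance
    }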

Quiz - Catch Rewards

What rewards are provided by the catch environment to the DQN agent? Check all correct boxes.

SOLUTION:
  • 0 if the ball is not getting closer or farther away
  • +1 if the ball is getting closer
  • -1 if the ball is getting farther away
  • +1 if the ball is caught

Running catch

To make sure that the DQN agent still works properly in C++, we can test the catch sample game. The package is provided in the “Test the API” GPU Workspace later in this lesson. Alternatively, it can be installed on a Jetson TX2 by following the build instructions in the repository.

To test the text-based catch sample, open the desktop in the “API Test” workspace, open a terminal, and navigate to the build directory with

cd /home/workspace/RoboND-DeepRL-Project/build

then run the executable from the build directory:

$ cd x86_64/bin
$ ./catch 

The terminal will list the initialization values, then print out results for each episode. After around 100 episodes, the agent should start winning nearly 100% of the time. The following is an example output:

[deepRL]  input_width:    64
[deepRL]  input_height:   64
[deepRL]  input_channels: 1
...
WON! episode 1
001 for 001  (1.0000)  
WON! episode 5
004 for 005  (0.8000)  
...
WON! episode 110
078 for 110  (0.7091)  19 of last 20  (0.95)  (max=0.95)
WON! episode 111
079 for 111  (0.7117)  19 of last 20  (0.95)  (max=0.95)
WON! episode 112
080 for 112  (0.7143)  20 of last 20  (1.00)  (max=1.00)

Internally, catch uses the dqnAgent API from our C++ library to implement the learning.

Alternate Arguments

There are some optional command-line parameters to catch that you can experiment with to change the dimensions of the environment and the pixel-array input size, increasing the complexity to see how it impacts convergence and training time:

$ ./catch --width=96 --height=96
$ ./catch --render  # enable text output of the environment

With 96x96 environment size, the catch agent achieves >75% accuracy after around 150-200 episodes.
With 128x128 environment size, the catch agent achieves >75% accuracy after around 325 episodes.